Genetic Distance for a General Non-Stationary Markov Substitution Process

Authors

  • Benjamin D. Kaehler
  • Von Bing Yap
  • Rongli Zhang
  • Gavin A. Huttley
Abstract

The genetic distance between biological sequences is a fundamental quantity in molecular evolution. It pertains to questions of rates of evolution, existence of a molecular clock, and phylogenetic inference. Under the class of continuous-time substitution models, the distance is commonly defined as the expected number of substitutions at any site in the sequence. We eschew the almost ubiquitous assumptions of evolution under stationary and time-reversible conditions and extend the concept of the expected number of substitutions to nonstationary Markov models where the only remaining constraint is of time homogeneity between nodes in the tree. Our measure of genetic distance reduces to the standard formulation if the data in question are consistent with the stationarity assumption. We apply this general model to samples from across the tree of life to compare distances so obtained with those from the general time-reversible model, with and without rate heterogeneity across sites, and the paralinear distance, an empirical pairwise method explicitly designed to address nonstationarity. We discover that estimates from both variants of the general time-reversible model and the paralinear distance systematically overestimate genetic distance and departure from the molecular clock. The magnitude of the distance bias is proportional to departure from stationarity, which we demonstrate to be associated with longer edge lengths. The marked improvement in consistency between the general nonstationary Markov model and sequence alignments leads us to conclude that analyses of evolutionary rates and phylogenies will be substantively improved by application of this model.
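To make the idea of an expected number of substitutions under nonstationarity concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation. It assumes the distance along one edge can be written as the time integral of the instantaneous substitution rate, d(t) = -∫₀ᵗ Σᵢ πᵢ(s) qᵢᵢ ds, where π(s) = π(0)·exp(Qs) is the state distribution at time s and Q is a time-homogeneous rate matrix. The abstract does not give this formula, so treat it as one plausible reading. The function names and the F81-style rate matrix are hypothetical choices made only for this example.

```python
def f81_rate_matrix(freqs):
    """Illustrative F81-style rate matrix: q_ij = freqs[j] for i != j.

    Assumption for this sketch only; its stationary distribution is `freqs`.
    """
    n = len(freqs)
    Q = [[freqs[j] if i != j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        Q[i][i] = -sum(Q[i])  # each row of a rate matrix sums to zero
    return Q


def distribution_derivative(pi, Q):
    """d(pi)/ds = pi Q for a row vector pi."""
    n = len(pi)
    return [sum(pi[i] * Q[i][j] for i in range(n)) for j in range(n)]


def substitution_rate(pi, Q):
    """Instantaneous substitution rate: -sum_i pi_i * q_ii."""
    return -sum(p * Q[i][i] for i, p in enumerate(pi))


def expected_substitutions(pi0, Q, t, steps=1000):
    """Integrate the substitution rate over [0, t] with classical RK4.

    pi0 is the distribution at the start of the edge; it need not be
    the stationary distribution of Q.
    """
    h = t / steps
    pi = list(pi0)
    total = 0.0
    for _ in range(steps):
        k1 = distribution_derivative(pi, Q)
        p2 = [p + 0.5 * h * k for p, k in zip(pi, k1)]
        k2 = distribution_derivative(p2, Q)
        p3 = [p + 0.5 * h * k for p, k in zip(pi, k2)]
        k3 = distribution_derivative(p3, Q)
        p4 = [p + h * k for p, k in zip(pi, k3)]
        k4 = distribution_derivative(p4, Q)
        # Accumulate the distance using the rate at the same RK4 stages.
        total += h * (substitution_rate(pi, Q) + 2 * substitution_rate(p2, Q)
                      + 2 * substitution_rate(p3, Q) + substitution_rate(p4, Q)) / 6
        pi = [p + h * (a + 2 * b + 2 * c + d) / 6
              for p, a, b, c, d in zip(pi, k1, k2, k3, k4)]
    return total


if __name__ == "__main__":
    freqs = [0.1, 0.2, 0.3, 0.4]
    Q = f81_rate_matrix(freqs)
    # Starting at stationarity: matches the standard -t * sum_i pi_i q_ii.
    print(expected_substitutions(freqs, Q, 1.0))
    # Starting away from stationarity: the distance differs.
    print(expected_substitutions([0.4, 0.3, 0.2, 0.1], Q, 1.0))
```

When `pi0` equals the stationary distribution of `Q`, the distribution does not change along the edge, so the integral collapses to the familiar -t Σᵢ πᵢ qᵢᵢ. This mirrors the abstract's statement that the measure reduces to the standard formulation under stationarity.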

Similar resources

Strong Stationary Times for Non-Uniform Markov Chains

This thesis studies several approaches to bounding the total variation distance of a Markov chain, focusing primarily on the strong stationary time approach. While strong stationary times have been used successfully with uniform walks on groups, non-uniform walks have proven harder to analyze. This project applied strong stationary techniques to simple non-uniform walks in the hope of finding s...

Cyclic Equilibria in Markov Games

Although variants of value iteration have been proposed for finding Nash or correlated equilibria in general-sum Markov games, these variants have not been shown to be effective in general. In this paper, we demonstrate by construction that existing variants of value iteration cannot find stationary equilibrium policies in arbitrary general-sum Markov games. Instead, we propose an alternative i...

Stochastic approach to molecular interactions and computational theory of metabolic and genetic regulations.

The underlying molecular mechanisms of metabolic and genetic regulations are computationally identical and can be described by a finite state Markov process. We establish a common computational model for both regulations based on the stationary distribution of the Markov process with the aim of establishing a unified, quantitative model of general biological regulations. Various existing result...

The Spacey Random Walk: A Stochastic Process for Higher-Order Data (SIAM Review, Vol. 59, No. 2)

Random walks are a fundamental model in applied mathematics and are a common example of a Markov chain. The limiting stationary distribution of the Markov chain represents the fraction of the time spent in each state during the stochastic process. A standard way to compute this distribution for a random walk on a finite set of states is to compute the Perron vector of the associated transition ...

Eventually-stationary policies for Markov decision models with non-constant discounting

We investigate the existence of simple policies in finite discounted cost Markov Decision Processes, when the discount factor is not constant. We introduce a class called “exponentially representable” discount functions. Within this class we prove existence of optimal policies which are eventually stationary (that is, stationary from some time N onward), and provide an algorithm for their computation. Outside this c...

Volume 64

Publication date: 2015